MacFormat 1996 January

Text File | 1995-08-13 | 24.2 KB | 631 lines | [TEXT/CWIE]

/*
CopyBitsQuickly.c
CopyBitsQuickly.c is a dumb substitute for CopyBits that ignores the color
tables and palettes, simply copying the raw pixels without any translation.
It's for doing animations. (Try the demo Sandstorm.) Besides copying images, it
can also add or multiply them. At one time it copied much faster than CopyBits
did, but the latest timing (under System 7), by TimeVideo, indicates that they
are of approximately equal speed.

Apple's CopyBits is an Apple Macintosh Toolbox routine for copying images, and
is documented in Inside Macintosh Volumes I, V, and VI, and New Inside
Macintosh: "Imaging with QuickDraw".

CopyBitsQuickly does not cause the Memory Manager to move memory, and thus may
be used in a VBL task.

I suggest that you use the higher-level interface provided by the VideoToolbox
CopyWindows.c, which saves you from getting your hands dirty messing with
pixmaps. You can just deal with windows and GWorlds.

The returned value is nonzero if an error occurred.

CopyBitsQuickly supports four modes:
• srcCopy copies the source to the destination.
• addOver adds the source to the destination. Both must have 8-bit pixels.
  Overflow is ignored.
• addOverParallel adds the source to the destination (4 bytes at a time),
  i.e. parallel addition. Overflow may carry over into neighboring pixels
  within the image. Supports all pixel sizes.
• mulOver causes the source and destination to be multiplied, pixel by pixel.
  Both must have 8-bit pixels. After multiplication, the product is divided by
  128 and stored in the destination. Overflow is ignored.
All the arithmetic is unsigned.

RESTRICTIONS:
• srcBits and dstBits must both have the same number of bits/pixel.
• dstRect and srcRect must have the same size.
  (As the code stands, srcCopy is exempt from the two restrictions above: it
  expands or converts the image instead of returning an error.)
• mode must be either srcCopy, addOver, addOverParallel, or mulOver.
• maskRgn must be NULL.
• If mode is addOver or mulOver then the pixel size must be 8 bits.
• If CopyBitsQuickly detects a violation of any of these restrictions it will
  return a nonzero value, indicating that an error occurred.

RETURNED VALUE:
0	Success.
1	Illegal srcMode (only srcCopy, addOver, addOverParallel, and mulOver are
	allowed).
2	maskRgn!=NULL.
3	Source and destination rects are of unequal size.
4	After clipping there were no pixels to copy, or RectToAddress couldn't
	resolve address of source or destination.
5	Source and destination have unequal pixel sizes.
6	We need 32-bit addressing but it's not available.
7	This mode requires 8-bit pixels and the supplied pixel size is not 8 bits.

ACKNOWLEDGEMENTS:
I learned the trick of using a switch() to jump into a loop from Bill Karsh's
solution to the April 1994 MacTech Programmer's Challenge.

LIMITATIONS:
• If a Rect extends across multiple screens, only as much of the upper-left of
  the Rect as is on one device will be used. The rest is clipped off.
• When accessing a screen, CopyBitsQuickly() ought to, but doesn't, call
  ShieldCursor() to remove the cursor from the part of the screen it's reading
  or writing. Calling ShieldCursor would also have the desirable side effect of
  informing nonstandard video devices, like the Radius PowerView, that the
  screen has been updated. (NOTE: CopyWindows does this for you before calling
  CopyBitsQuickly.)

NOTE: For highest speed you should choose your srcRectPtr & dstRectPtr so that
the first pixel moved to and from each row begins at a memory address that is a
multiple of 4 bytes. The effect on speed is substantial, about 25%.

NOTE: If your computer boots in 24-bit mode, as set by the Memory Control
Panel, then the THINK C Debugger will crash if it's activated while you've
temporarily switched into 32-bit mode. So don't put any breakpoints in any
section of code that's bracketed by calls to SwapMMUMode() unless your computer
booted up in 32-bit mode. If your computer boots in 32-bit mode then the calls
to SwapMMUMode do nothing, and you can put Debugger breakpoints anywhere.

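EXAMPLE OF THE SWITCH TRICK: The switch-into-a-loop trick credited under
ACKNOWLEDGEMENTS can be sketched in isolation. This is a minimal illustration
under stated assumptions, not the code used in this file: copy_words and
copy_words_ok are hypothetical names, the loop is unrolled only 4 times (the
routines in this file unroll 16 or 32 times), and only // comments appear
because the sketch sits inside this block comment.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: copy n words from src to dst.
// The switch jumps into the middle of the unrolled loop body to handle the
// n%4 leftover words first; the for loop then makes n/4 full passes.
static void copy_words(unsigned long *dst, const unsigned long *src, size_t n)
{
	long i = (long)(n >> 2);	// number of full 4-word passes
	switch (n & 3) {
		for (; i >= 0; i--) {
			*dst++ = *src++;
			case 3: *dst++ = *src++;
			case 2: *dst++ = *src++;
			case 1: *dst++ = *src++;
			case 0: ;
		}
	}
}

// Hypothetical self-check: every length from 0 to 9 copies exactly n words
// and leaves the rest of the destination untouched.
static int copy_words_ok(void)
{
	unsigned long src[9], dst[9];
	size_t n, k;
	for (k = 0; k < 9; k++) src[k] = 100 + k;
	for (n = 0; n <= 9; n++) {
		for (k = 0; k < 9; k++) dst[k] = 0;
		copy_words(dst, src, n);
		for (k = 0; k < 9; k++)
			assert(dst[k] == (k < n ? src[k] : 0UL));
	}
	return 1;
}
```

On entry the switch skips the loop test, so the first (partial) pass copies
n%4 words; each later pass copies 4. The loop counter starts at n/4 and the
test is i>=0 after the decrement, which gives exactly n/4 full passes.
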
BLOCKMOVEDATA: NOT FASTER
BlockMoveData is a new (as of System Update 3.0 to System 7) variant of
BlockMove that omits cache flushing. Issuing BlockMoveData() on earlier
versions of the operating system will invoke plain old BlockMove().
BlockMoveData, like BlockMove, uses the MOVE16 instruction, on computers that
have it, so it could potentially be faster than the generic code that most
compilers produce. However, I haven't found any speed advantage on the
PowerBook 170, Mac II, IIfx, and Power Mac in my lab (none of which have the
68040, which is the only cpu that has MOVE16). I haven't tried it on a Quadra.
To my surprise, BlockMoveData isn't faster than my C loop on the Mac II, IIfx,
or Power Mac 6100/60, and is distinctly slower in a few cases (e.g. 1 bit mode
on Toby card), so I've disabled it. So the following 2 paragraphs are moot.

IGNORE: CopyBitsQuickly.c, if possible, now uses Apple's BlockMoveData() for
highest possible speed on all Macs. For best performance you should set your
Memory Control Panel to 32-bit addressing, and you should install Apple's
System Update 3.0, which requires System 7.1, or whatever System release
supersedes it. If your Mac is very old, e.g. a Mac II, you may need to install
the freely available MODE32 init in order to be able to enable 32-bit
addressing.

IGNORE: I've disabled the use of BlockMoveData if the computer has a 68040
processor, yet is a Mac II. I do this because BlockMoveData crashes on my
Radius Rocket (68040 processor on a NuBus card) in my Mac II when handed
addresses in video memory, even though MODE32 is installed. Presumably this
indicates that the BlockMove routine is not 32-bit clean, despite the runtime
patches installed by MODE32. This is puzzling since BlockMove works fine
accessing the same video addresses in either 24 or 32-bit mode (with MODE32)
without the Rocket.

Copyright ©1989-1995 Denis G. Pelli.

HISTORY:
1/89	dgp	Version 2.0: added support for PixMaps and multiple screens.
			Added checking.
6/89	dgp	Version 3.0: now use RectToAddress, which clips to one device.
10/89	dgp	Version 3.5: Improved resolution from longs to bytes.
10/89	dgp	Version 4.0: Added new mode: addOver
3/90	dgp	Version 4.01: Made cosmetic changes:
			renamed srcRect & dstRect to srcRectPtr and dstRectPtr.
			renamed srcAdd to addOver, to conform to CopyBits.
			added a few more comments to explain the initial clipping.
3/20/90	dgp	made compatible with MPW C.
4/20/90	dgp	now uses 32-bit addressing only if QD32 is present.
4/9/91	dgp	v 4.05: changed nudge from short to long, just to be safe
8/24/91	dgp	Made compatible with THINK C 5.0.
4/15/92	dgp	Updated CopyBitsQuickly's function header to Standard C style.
10/5/92	dgp	Dropped support for THINK C 4. Updated the documentation above.
12/2/92	dgp	cosmetic changes
12/8/92	dgp	fixed major gaffe introduced on 12/2/92: "case" prefix was missing
			in switch statement. This caused CopyBitsQuickly to do nothing.
1/31/93	dgp	Added new "multiplyQuickly" mode requested by Josh Solomon. Now
			insist on 8-bit pixels for both addOver and multiplyQuickly modes.
2/18/93	js	added mulOver to list of allowed modes. (Oops! - dgp.) Works ok now.
2/18/93	dgp	Now return int, nonzero if error occurred.
7/9/93	dgp	check for 32-bit addressing capability.
6/5/94	dgp	Replaced all assembly code by portable C code of similar speed.
			Only call SwapMMUMode() if we must. Give error if we need 32-bit
			mode and it's not available. Documented the returned value.
6/7/94	dgp	Added code to use Apple's BlockMoveData() for highest possible
			speed on all Macs, but disabled it because it didn't turn out to be
			faster on the machines on which I've tested it: Mac II, IIfx, and
			Power Mac 6100/60.
6/7/94	dgp	Added new mode "addOverParallel" which accepts any pixelSize and
			adds source to destination very quickly by adding 4 bytes at a time.
6/14/94	dgp	can32 is now computed by calling TrapAvailable(_SwapMMUMode),
			which returns the correct answer even on Macs with dirty ROMs.
5/23/95	dgp	Apple changed the prototype in the header file from
			SwapMMUMode(char *) to SwapMMUMode(signed char *). To retain
			compatibility with both old and new headers, I cast the argument
			(void *).
*/
#include "VideoToolbox.h"
void ReadPixels(int x,int y,int n,unsigned long *value
	,unsigned char *baseAddr,long pixelSize,long rowBytes);
void WritePixels(int x,int y,int n,unsigned long *value
	,unsigned char *baseAddr,long pixelSize,long rowBytes);
// The srcMode constants addOverParallel and mulOver are defined in VideoToolbox.h
#ifndef __TRAPS__
	#include <Traps.h>	// _SwapMMUMode
#endif
#if (THINK_C || THINK_CPLUS || SYMANTEC_C)
	// These THINK C options seem to have very little effect on the code produced.
	// However, if you don't disable "assign_registers" then one of the variables
	// declared "register" in srcCopyQuickly() fails to be assigned to a register.
	#pragma options(!assign_registers,honor_register,redundant_loads,defer_adjust)
	#pragma options(global_optimizer,gopt_induction,gopt_loop,gopt_cse,gopt_coloring)
#endif

typedef unsigned char *UPtr;
static void Expand8(double hMag,double vMag,double hOffset,double vOffset
	,register UPtr Src,Rect *srcRect,unsigned long srcRowBytes
	,register UPtr Dst,Rect *dstRect,unsigned long dstRowBytes,Boolean do32);
static void Expand(double hMag,double vMag,double hOffset,double vOffset
	,register UPtr Src,Rect *srcRect,int srcPixelBits,unsigned long srcRowBytes
	,register UPtr Dst,Rect *dstRect,int dstPixelBits,unsigned long dstRowBytes,Boolean do32);
static void SrcCopyQuickly(UPtr Src,unsigned long srcinc,
	UPtr Dst,unsigned long dstinc,
	unsigned long bytes,unsigned long lines,Boolean do32);
static void SrcCopyQuickly2(UPtr Src,unsigned long srcinc,
	UPtr Dst,unsigned long dstinc,
	unsigned long bytes,unsigned long lines,Boolean do32);
static void AddOverParallel(UPtr Src,unsigned long srcinc,
	UPtr Dst,unsigned long dstinc,
	unsigned long bytes,unsigned long lines,Boolean do32);
static void AddOver8(UPtr Src,unsigned long srcinc,
	UPtr Dst,unsigned long dstinc,
	unsigned long bytes,unsigned long lines,Boolean do32);
static void MulOver8(UPtr Src,unsigned long srcinc,
	UPtr Dst,unsigned long dstinc,
	unsigned long bytes,unsigned long lines,Boolean do32);

int CopyBitsQuickly(BitMap *srcBits,BitMap *dstBits
	,Rect *srcRectPtr,Rect *dstRectPtr,long srcMode,RgnHandle maskRgn)
{
	UPtr Src,Dst;
	long srcinc,dstinc;
	unsigned long lines;
	short srcRowBytes,dstRowBytes,srcPixelSize,dstPixelSize,srcBitsOffset,dstBitsOffset;
	Rect mySrcRect,myDstRect;
	int hOffset,vOffset;
	double hMag,vMag;
	long nudge,bytes;
	Boolean do32,useBlockMove;
	static Boolean can32,is32,wantBlockMove,firstTime=1;
	long error,addressing,machine,processor;

	srcMode&=0xffff;	// upper bits are used only by CopyWindows.
	if(srcMode != srcCopy && srcMode != addOver && srcMode != addOverParallel
		&& srcMode != mulOver) return 1;
	if(maskRgn != NULL) return 2;

	/* clip the rect to be copied by the bounds of source and destination */
	mySrcRect=*srcRectPtr;
	myDstRect=*dstRectPtr;
	hMag=(double)(myDstRect.right-myDstRect.left)/(mySrcRect.right-mySrcRect.left);
	vMag=(double)(myDstRect.bottom-myDstRect.top)/(mySrcRect.bottom-mySrcRect.top);
	/* only srcCopy may change the size; every other mode needs equal rects */
	if((vMag!=1 || hMag!=1) && srcMode!=srcCopy)return 3;
	/* first make sure that srcRect and dstRect are the same size */
	// if(mySrcRect.bottom-mySrcRect.top != myDstRect.bottom-myDstRect.top ||
	//	mySrcRect.right-mySrcRect.left != myDstRect.right-myDstRect.left)
	//	return 3;
	hOffset=myDstRect.left-mySrcRect.left*hMag;
	vOffset=myDstRect.top-mySrcRect.top*vMag;

	/* clip myDstRect */
	Dst = RectToAddress((PixMap *)dstBits,&myDstRect,&dstRowBytes,&dstPixelSize,&dstBitsOffset);

	/*
	This prevents writing outside the destination. The cost is that part of the
	inside will not be written. The problem arises because this routine's code
	can only write whole bytes, and the boundary may be in the middle of a byte.
	So, rather than writing an extra fraction of a byte (outside the destination
	rect) we leave the byte alone and fail to update a small portion inside the
	destination rect.
	*/
	if(dstBitsOffset>0) {
		nudge=(7+dstBitsOffset)/8;
		dstBitsOffset -= nudge*8;
		Dst += nudge;
		myDstRect.left += nudge*8/dstPixelSize;
	}

	/* Copy any clipping of myDstRect over to mySrcRect */
	mySrcRect=myDstRect;
	ExpandAndOffsetRect(&mySrcRect,1/hMag,1/vMag,-hOffset/hMag,-vOffset/vMag);

	/* clip mySrcRect */
	Src=RectToAddress((PixMap *)srcBits,&mySrcRect
		,&srcRowBytes,&srcPixelSize,&srcBitsOffset);

	/* Copy any clipping of mySrcRect back to myDstRect */
	myDstRect=mySrcRect;
	ExpandAndOffsetRect(&myDstRect,hMag,vMag,hOffset,vOffset);
	Dst=RectToAddress((PixMap *)dstBits,&myDstRect
		,&dstRowBytes,&dstPixelSize,&dstBitsOffset);

	if(Src==NULL || Dst==NULL) return 4;
	if(srcPixelSize != dstPixelSize && srcMode!=srcCopy) return 5;

	bytes = mySrcRect.right - mySrcRect.left;	/* number of pixels per line */
	bytes *= srcPixelSize;						/* number of bits per line */
	bytes /= 8;									/* number of bytes per line */
	srcinc = srcRowBytes - bytes;	/* offset in bytes to beginning of next line */
	dstinc = dstRowBytes - bytes;
	lines=mySrcRect.bottom - mySrcRect.top;		/* number of lines */
	/* rows are contiguous in both pixmaps, so treat the block as one long line */
	if(srcinc==0 && dstinc==0){
		bytes*=lines;
		lines=1;
	}

	if(firstTime){
		can32=TrapAvailable(_SwapMMUMode);
		addressing=0;
		error=Gestalt(gestaltAddressingModeAttr,&addressing);
		is32=addressing&(1L<<gestalt32BitAddressing);
		// My tests indicate little or no advantage, even on PowerPC Macs.
		wantBlockMove=0;
		// Crude test for Rocket in Mac II.
		// BlockMove to or from a video address crashes my Radius Rocket.
		error=Gestalt(gestaltProcessorType,&processor);
		error=Gestalt(gestaltMachineType,&machine);
		if(machine==gestaltMacII && processor==gestalt68040)wantBlockMove=0;
		firstTime=0;
	}

	// Must we switch to 32-bit addressing?
	do32=(unsigned long)Src>0xffffffUL || (unsigned long)Dst>0xffffffUL;
	do32=do32 && !is32;
	if(do32 && !can32)return 6;

	// Can't use traps if we switch 24/32-bit mode.
	useBlockMove=wantBlockMove && !do32 && srcPixelSize==dstPixelSize;

	switch(srcMode){
	case srcCopy:
		if(hMag!=1 || vMag!=1 || srcPixelSize != dstPixelSize){
			if(srcPixelSize==8 && dstPixelSize==8)Expand8(hMag,vMag,hOffset,vOffset
				,Src,&mySrcRect,srcRowBytes,Dst,&myDstRect,dstRowBytes,do32);
			else Expand(hMag,vMag,hOffset,vOffset
				,Src,&mySrcRect,srcPixelSize,srcRowBytes
				,Dst,&myDstRect,dstPixelSize,dstRowBytes,do32);
		}
		else if(useBlockMove)SrcCopyQuickly2(Src,srcinc,Dst,dstinc,bytes,lines,do32);
		else SrcCopyQuickly(Src,srcinc,Dst,dstinc,bytes,lines,do32);
		break;
	case addOverParallel:
		if(srcPixelSize!=dstPixelSize)return 5;
		AddOverParallel(Src,srcinc,Dst,dstinc,bytes,lines,do32);
		break;
	case addOver:
		if(srcPixelSize!=8 || dstPixelSize!=8)return 7;
		AddOver8(Src,srcinc,Dst,dstinc,bytes,lines,do32);
		break;
	case mulOver:
		if(srcPixelSize!=8 || dstPixelSize!=8)return 7;
		MulOver8(Src,srcinc,Dst,dstinc,bytes,lines,do32);
		break;
	default:
		return 1;
		break;
	}
	return 0;
}

static void Expand8(double hMag,double vMag,double hOffset,double vOffset
	,register UPtr Src,Rect *srcRect,unsigned long srcRowBytes
	,register UPtr Dst,Rect *dstRect,unsigned long dstRowBytes,Boolean do32)
{
	int hh=round(hMag),vv=round(vMag);
	int srcWidth,dstWidth;
	register int i,j,ii;
	register UPtr src,dst;
	register unsigned char a;
	signed char mmumode=true32b;

	hOffset;vOffset;dstRect;	/* unused arguments */
	srcWidth=srcRect->right-srcRect->left;
	dstWidth=srcWidth*hh;
	for(j=srcRect->bottom-srcRect->top;j>0;j--){
		src=Src;
		dst=Dst;
		if(do32)SwapMMUMode((void *)&mmumode);	/* set 32-bit mode */
		/* write each source pixel hh times across the destination row */
		for(i=srcWidth;i>0;i--){
			a=*src++;
			for(ii=hh;ii>0;ii--)*dst++=a;
		}
		if(do32)SwapMMUMode((void *)&mmumode);	/* restore */
		/* replicate the finished destination row vv-1 more times */
		src=Dst;
		Dst+=dstRowBytes;
		for(ii=vv-1;ii>0;ii--){
			BlockMoveData(src,Dst,dstWidth);
			Dst+=dstRowBytes;
		}
		Src+=srcRowBytes;
	}
}

static void Expand(double hMag,double vMag,double hOffset,double vOffset
	,register UPtr Src,Rect *srcRect,int srcPixelBits,unsigned long srcRowBytes
	,register UPtr Dst,Rect *dstRect,int dstPixelBits,unsigned long dstRowBytes,Boolean do32)
{
	int hh=round(hMag),vv=round(vMag);
	long srcWidth,dstWidth,bytes;
	register int i,j,ii;
	register UPtr src;
	signed char mmumode=true32b;
	unsigned long value[1024],v,*srcV,*dstV;

	hOffset;vOffset;dstRect;do32;	/* unused arguments */
	srcWidth=srcRect->right-srcRect->left;
	dstWidth=srcWidth*hh;
	bytes=dstWidth*dstPixelBits/8;
	for(j=srcRect->bottom-srcRect->top;j>0;j--){
		ReadPixels(0,0,srcWidth,value,Src,srcPixelBits,srcRowBytes);
		/* expand in place, working backwards from the end of value[] */
		srcV=value+srcWidth-1;
		dstV=value+srcWidth*hh-1;
		for(i=srcWidth;i>0;i--){
			v=*srcV--;
			for(ii=hh;ii>0;ii--)*dstV--=v;
		}
		WritePixels(0,0,dstWidth,value,Dst,dstPixelBits,dstRowBytes);
		/* replicate the finished destination row vv-1 more times */
		src=Dst;
		Dst+=dstRowBytes;
		for(ii=vv-1;ii>0;ii--){
			BlockMoveData(src,Dst,bytes);
			Dst+=dstRowBytes;
		}
		Src+=srcRowBytes;
	}
}

static void SrcCopyQuickly2(register UPtr Src,register unsigned long srcinc,
	register UPtr Dst,register unsigned long dstinc,
	unsigned long bytes,register unsigned long lines,Boolean do32)
{
	// See discussion of BlockMoveData at top of this file.
	do32;	/* dgp: prevent "unused argument" warning */
	srcinc+=bytes;
	dstinc+=bytes;
	for(;lines>0;lines--){
		BlockMoveData(Src,Dst,bytes);
		Src+=srcinc;
		Dst+=dstinc;
	}
}

#define useMask 0

static void SrcCopyQuickly(UPtr xSrc,register unsigned long srcinc,
	UPtr xDst,register unsigned long dstinc,
	register unsigned long bytes,register unsigned long lines,Boolean do32)
{
	register unsigned long *SrcL=(unsigned long *)xSrc,*DstL=(unsigned long *)xDst;
	register long i;
	signed char mmumode=true32b;
	static unsigned long mask32[4]={0,0xff,0xffff,0xffffff};
	register unsigned long m=mask32[bytes&3];

	if(useMask){
		srcinc+=bytes&3;
		dstinc+=bytes&3;
	}
	if(do32)SwapMMUMode((void *)&mmumode);	/* set 32-bit mode */
	for(;lines>0;lines--) {
		/* copy (bytes>>2) longs: the switch handles the remainder, modulo 32 */
		i=bytes>>7;
		switch((bytes>>2)&31){
			for(;i>=0;i--){
				*DstL++ = *SrcL++;
				case 31: *DstL++ = *SrcL++;
				case 30: *DstL++ = *SrcL++;
				case 29: *DstL++ = *SrcL++;
				case 28: *DstL++ = *SrcL++;
				case 27: *DstL++ = *SrcL++;
				case 26: *DstL++ = *SrcL++;
				case 25: *DstL++ = *SrcL++;
				case 24: *DstL++ = *SrcL++;
				case 23: *DstL++ = *SrcL++;
				case 22: *DstL++ = *SrcL++;
				case 21: *DstL++ = *SrcL++;
				case 20: *DstL++ = *SrcL++;
				case 19: *DstL++ = *SrcL++;
				case 18: *DstL++ = *SrcL++;
				case 17: *DstL++ = *SrcL++;
				case 16: *DstL++ = *SrcL++;
				case 15: *DstL++ = *SrcL++;
				case 14: *DstL++ = *SrcL++;
				case 13: *DstL++ = *SrcL++;
				case 12: *DstL++ = *SrcL++;
				case 11: *DstL++ = *SrcL++;
				case 10: *DstL++ = *SrcL++;
				case 9: *DstL++ = *SrcL++;
				case 8: *DstL++ = *SrcL++;
				case 7: *DstL++ = *SrcL++;
				case 6: *DstL++ = *SrcL++;
				case 5: *DstL++ = *SrcL++;
				case 4: *DstL++ = *SrcL++;
				case 3: *DstL++ = *SrcL++;
				case 2: *DstL++ = *SrcL++;
				case 1: *DstL++ = *SrcL++;
				case 0:;
			}
		}
		if(useMask){
			/* disabled: useMask is 0. ~m (not !m) is the complementary mask */
			if(m) *DstL=(m & *SrcL) | (~m & *DstL);
		}else{
			/* copy the last 0 to 3 bytes of the line */
			if(bytes&2){
				*(unsigned short *)DstL=*(unsigned short *)SrcL;
				DstL=(unsigned long *)(1+(unsigned short *)DstL);
				SrcL=(unsigned long *)(1+(unsigned short *)SrcL);
			}
			if(bytes&1){
				*(unsigned char *)DstL=*(unsigned char *)SrcL;
				DstL=(unsigned long *)(1+(unsigned char *)DstL);
				SrcL=(unsigned long *)(1+(unsigned char *)SrcL);
			}
		}
		DstL=(unsigned long *)(dstinc+(unsigned char *)DstL);
		SrcL=(unsigned long *)(srcinc+(unsigned char *)SrcL);
	}
	if(do32)SwapMMUMode((void *)&mmumode);	/* restore */
}

static void AddOverParallel(UPtr xSrc,register unsigned long srcinc,
	UPtr xDst,register unsigned long dstinc,
	register unsigned long bytes,register unsigned long lines,Boolean do32)
{
	register unsigned long *SrcL=(unsigned long *)xSrc,*DstL=(unsigned long *)xDst;
	register long i;
	signed char mmumode;

	mmumode=true32b;
	if(do32)SwapMMUMode((void *)&mmumode);	/* set 32-bit mode */
	for(;lines>0;lines--) {
		/* add (bytes>>2) longs: the switch handles the remainder, modulo 32 */
		i=bytes>>7;
		switch((bytes>>2)&31){
			for(;i>=0;i--){
				*DstL++ += *SrcL++;
				case 31: *DstL++ += *SrcL++;
				case 30: *DstL++ += *SrcL++;
				case 29: *DstL++ += *SrcL++;
				case 28: *DstL++ += *SrcL++;
				case 27: *DstL++ += *SrcL++;
				case 26: *DstL++ += *SrcL++;
				case 25: *DstL++ += *SrcL++;
				case 24: *DstL++ += *SrcL++;
				case 23: *DstL++ += *SrcL++;
				case 22: *DstL++ += *SrcL++;
				case 21: *DstL++ += *SrcL++;
				case 20: *DstL++ += *SrcL++;
				case 19: *DstL++ += *SrcL++;
				case 18: *DstL++ += *SrcL++;
				case 17: *DstL++ += *SrcL++;
				case 16: *DstL++ += *SrcL++;
				case 15: *DstL++ += *SrcL++;
				case 14: *DstL++ += *SrcL++;
				case 13: *DstL++ += *SrcL++;
				case 12: *DstL++ += *SrcL++;
				case 11: *DstL++ += *SrcL++;
				case 10: *DstL++ += *SrcL++;
				case 9: *DstL++ += *SrcL++;
				case 8: *DstL++ += *SrcL++;
				case 7: *DstL++ += *SrcL++;
				case 6: *DstL++ += *SrcL++;
				case 5: *DstL++ += *SrcL++;
				case 4: *DstL++ += *SrcL++;
				case 3: *DstL++ += *SrcL++;
				case 2: *DstL++ += *SrcL++;
				case 1: *DstL++ += *SrcL++;
				case 0:;
			}
		}
		/* add the last 0 to 3 bytes of the line */
		if(bytes&2){
			*(unsigned short *)DstL += *(unsigned short *)SrcL;
			DstL=(unsigned long *)(1+(unsigned short *)DstL);
			SrcL=(unsigned long *)(1+(unsigned short *)SrcL);
		}
		if(bytes&1){
			*(unsigned char *)DstL += *(unsigned char *)SrcL;
			DstL=(unsigned long *)(1+(unsigned char *)DstL);
			SrcL=(unsigned long *)(1+(unsigned char *)SrcL);
		}
		DstL=(unsigned long *)(dstinc+(unsigned char *)DstL);
		SrcL=(unsigned long *)(srcinc+(unsigned char *)SrcL);
	}
	if(do32)SwapMMUMode((void *)&mmumode);	/* restore */
}

static void AddOver8(register UPtr Src,register unsigned long srcinc,
	register UPtr Dst,register unsigned long dstinc,
	register unsigned long bytes,register unsigned long lines,Boolean do32)
{
	register long i;
	signed char mmumode;

	mmumode=true32b;
	if(do32)SwapMMUMode((void *)&mmumode);	/* set 32-bit mode */
	for(;lines>0;lines--) {
		/* add one byte at a time: the switch handles the remainder, modulo 32 */
		i=bytes>>5;
		switch(bytes&31){
			for(;i>=0;i--){
				*Dst++ += *Src++;
				case 31: *Dst++ += *Src++;
				case 30: *Dst++ += *Src++;
				case 29: *Dst++ += *Src++;
				case 28: *Dst++ += *Src++;
				case 27: *Dst++ += *Src++;
				case 26: *Dst++ += *Src++;
				case 25: *Dst++ += *Src++;
				case 24: *Dst++ += *Src++;
				case 23: *Dst++ += *Src++;
				case 22: *Dst++ += *Src++;
				case 21: *Dst++ += *Src++;
				case 20: *Dst++ += *Src++;
				case 19: *Dst++ += *Src++;
				case 18: *Dst++ += *Src++;
				case 17: *Dst++ += *Src++;
				case 16: *Dst++ += *Src++;
				case 15: *Dst++ += *Src++;
				case 14: *Dst++ += *Src++;
				case 13: *Dst++ += *Src++;
				case 12: *Dst++ += *Src++;
				case 11: *Dst++ += *Src++;
				case 10: *Dst++ += *Src++;
				case 9: *Dst++ += *Src++;
				case 8: *Dst++ += *Src++;
				case 7: *Dst++ += *Src++;
				case 6: *Dst++ += *Src++;
				case 5: *Dst++ += *Src++;
				case 4: *Dst++ += *Src++;
				case 3: *Dst++ += *Src++;
				case 2: *Dst++ += *Src++;
				case 1: *Dst++ += *Src++;
				case 0:;
			}
		}
		Src += srcinc;
		Dst += dstinc;
	}
	if(do32)SwapMMUMode((void *)&mmumode);	/* restore */
}

// Multiply two unsigned 8-bit pixels, and divide the product by 128.
static void MulOver8(register UPtr Src,register unsigned long srcinc,
	register UPtr Dst,register unsigned long dstinc,
	register unsigned long bytes,register unsigned long lines,Boolean do32)
{
	register long i;
	signed char mmumode;

	mmumode=true32b;
	if(do32)SwapMMUMode((void *)&mmumode);	/* set 32-bit mode */
	for(;lines>0;lines--) {
		i=bytes>>4;
		switch(bytes&15){
			for(;i>=0;i--){
				*Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 15: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 14: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 13: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 12: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 11: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 10: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 9: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 8: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 7: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 6: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 5: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 4: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 3: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 2: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 1: *Dst = ((unsigned short)(*Dst)*(*Src++))>>7; Dst++;
				case 0:;
			}
		}
		Src += srcinc;
		Dst += dstinc;
	}
	if(do32)SwapMMUMode((void *)&mmumode);	/* restore */
}
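
/*
The arithmetic of mulOver, addOver, and addOverParallel can be checked on
single values. What follows is a minimal sketch, not part of VideoToolbox:
mul_over_pixel, add_parallel, add_per_byte, and check_mode_arithmetic are
hypothetical names, and each function works on one value instead of a pixmap.
It assumes the product is shifted right by 7 and truncated to 8 bits, as in
MulOver8, and that addOverParallel adds 32 bits at a time.
*/

```c
#include <assert.h>

/* One mulOver pixel: multiply, then shift right by 7 (divide by 128).
   The cast to unsigned char keeps only the low 8 bits: overflow is ignored. */
static unsigned char mul_over_pixel(unsigned char dst, unsigned char src)
{
	return (unsigned char)(((unsigned long)dst * src) >> 7);
}

/* addOverParallel on one 32-bit word: a single add, so a carry out of one
   8-bit pixel spills into the pixel stored in the next higher byte. */
static unsigned long add_parallel(unsigned long dst, unsigned long src)
{
	return (dst + src) & 0xFFFFFFFFUL;
}

/* addOver on the same four 8-bit pixels, one byte at a time: each sum wraps
   within its own byte and never touches a neighbor. */
static unsigned long add_per_byte(unsigned long dst, unsigned long src)
{
	unsigned long out = 0;
	int shift;
	for (shift = 0; shift < 32; shift += 8) {
		unsigned long sum = ((dst >> shift) & 0xFF) + ((src >> shift) & 0xFF);
		out |= (sum & 0xFF) << shift;
	}
	return out;
}

/* Worked values for the three functions above. */
static void check_mode_arithmetic(void)
{
	assert(mul_over_pixel(128, 128) == 128);	/* 128 acts as unity */
	assert(mul_over_pixel(200, 64) == 100);		/* 12800>>7 */
	assert(mul_over_pixel(255, 255) == 252);	/* 508 truncated to 8 bits */
	/* the 0xFF pixel overflows: the parallel add carries into its neighbor */
	assert(add_parallel(0x10FF2030UL, 0x01010101UL) == 0x12002131UL);
	assert(add_per_byte(0x10FF2030UL, 0x01010101UL) == 0x11002131UL);
}
```

/*
The two additions differ only in the top byte (0x12 versus 0x11), which is the
carry described for addOverParallel in the header comment.
*/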